Load some packages that we’ll need to use to do these calculations:
library(tidyverse)
library(gifski)
library(ggraph)
library(here)
library(igraph)
source(here("modelFunction_rewiring.R"))
# Define parameters
N <- 50
edge.prob <- 0.04
burn.in <- 20
burn.out <- 5
pm <- 0.3
ps <- 0.1
pa <- 0.2
add00 <- c(0.5, 10)
lose01 <- 0.1
add10 <- 0.05
lose11 <- c(0.5, 0.5)
histMultiplier <- 1.2
doRemoval <- TRUE
modelGraphs <- runModel(N = N, # Nodes in the network
                        edge.prob = edge.prob,
                        burn.in = burn.in,
                        burn.out = burn.out,
                        pm = pm,
                        ps = ps,
                        pa = pa,
                        add00 = add00,
                        lose01 = lose01,
                        add10 = add10,
                        lose11 = lose11,
                        histMultiplier = histMultiplier,
                        doRemoval = doRemoval) %>%
  lapply(function(x) {
    igraph::graph_from_adjacency_matrix(x, mode = "undirected", add.colnames = "label")
  })
Because I have not given individuals different tendencies to associate with others, we don’t expect to see consistent individual variation in betweenness or degree. Some variation should emerge in each time slice from the random interactions, but there’s no reason it should be consistent across time slices.
Compute degree and betweenness for each of the model networks, and visualize how they change over time.
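The per-slice node measures could be computed along these lines (a sketch: `sample_gnp()` stand-ins replace the real `modelGraphs` list built above, and the data-frame layout is an assumption):

```r
library(igraph)

# Stand-in for modelGraphs: a short list of random graphs of the same size
# (the real list comes from runModel() above)
set.seed(1)
graphs <- lapply(1:5, function(i) sample_gnp(50, 0.04))

# One row per node per time slice, with its degree and betweenness
nodeMeasures <- do.call(rbind, lapply(seq_along(graphs), function(i) {
  g <- graphs[[i]]
  data.frame(slice = i,
             node = seq_len(vcount(g)),
             degree = degree(g),
             betweenness = betweenness(g))
}))
```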
## Warning: Removed 1 row(s) containing missing values (geom_path).
How does cluster_fast_greedy() deal with isolated nodes?
We can see from this plot that throughout the model, we have some isolated nodes.
The question is, what does the clustering algorithm do to them?
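The two counts compared below could be tallied as follows (a sketch, using `sample_gnp()` stand-ins for the model graphs; `sizes()` returns the community sizes of a clustering):

```r
library(igraph)

set.seed(1)
graphs <- lapply(1:5, function(i) sample_gnp(50, 0.04))
clustered <- lapply(graphs, cluster_fast_greedy)

# Isolated nodes: vertices with degree zero
nIsolatedNodes <- sapply(graphs, function(g) sum(degree(g) == 0))

# Singleton clusters: communities containing exactly one node
nSingletonClusters <- sapply(clustered, function(cl) sum(sizes(cl) == 1))
```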
Well, we can also see that there are clusters of size 1:
Indeed, there’s a clear 1:1 relationship:
df <- data.frame(slice = 1:length(nIsolatedNodes),
                 nIsolatedNodes = nIsolatedNodes,
                 nSingletonClusters = nSingletonClusters)
df %>%
  ggplot(aes(x = nIsolatedNodes, y = nSingletonClusters)) +
  geom_point() +
  geom_line() +
  theme_minimal()
This means that isolated nodes are treated as their own clusters by the clustering algorithm.
(Note: this also suggests that in this model, a connected node will never be its own cluster. I don’t think I can conclude that this is true over the whole parameter space of this model. But it’s true at least at these sizes of N and connection probabilities.)
Okay, so, these isolated nodes could be affecting the modularity calculations. What if we compute modularity on networks with the isolated nodes removed?
How does modularity behave over time when we remove isolated nodes?
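The `noIso` list used below could be built by deleting degree-zero vertices from each time slice (a sketch; stand-in graphs here, but in the real analysis this would be applied to `modelGraphs`):

```r
library(igraph)

# Stand-in for modelGraphs
set.seed(1)
graphs <- lapply(1:5, function(i) sample_gnp(50, 0.04))

# Drop every vertex with degree zero from each time slice
noIso <- lapply(graphs, function(g) delete_vertices(g, which(degree(g) == 0)))
```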
clustered_noIso <- lapply(noIso, function(x) {
  cluster_fast_greedy(x)
})
mods_noIso <- data.frame(slice = 1:length(clustered_noIso),
                         modularity = unlist(lapply(clustered_noIso, modularity)),
                         nClusters = unlist(lapply(clustered_noIso, length)),
                         nIndivs = unlist(lapply(clustered_noIso, function(x) length(membership(x)))))
allModularities <- bind_rows(modularities %>% mutate(allowIso = "Original networks"),
                             mods_noIso %>% mutate(allowIso = "Isolated nodes removed"))
allModularities %>%
  ggplot(aes(x = slice, y = modularity, col = allowIso)) +
  geom_line(col = moduColor, size = 1) +
  theme_minimal() +
  facet_wrap(~allowIso) +
  geom_vline(aes(xintercept = burn.in + 1), size = 0.5, lty = 2) +
  ylab("Modularity") +
  xlab("Time slice")
These graphs are exactly the same! It seems that isolated nodes are being treated as their own clusters, but they are not affecting the modularity calculation.
To understand this, I need to go back to the mathematical formula that is being used here. But I suspect it has something to do with 0’s and 1’s and the ratios between them all working out nicely. To discuss more with Noa.
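A small check supports this (a sketch): an isolated node has degree zero and no incident edges, so every term it contributes to Q = (1/2m) Σᵢⱼ (Aᵢⱼ − kᵢkⱼ/2m) δ(cᵢ, cⱼ) is zero regardless of which community it is assigned to, and m (the total edge count) is unchanged by adding or removing it.

```r
library(igraph)

# Two triangles joined by a bridge
g <- graph_from_literal(A-B, B-C, A-C, D-E, E-F, D-F, C-D)

# The same graph plus one isolated vertex
gIso <- add_vertices(g, 1)

q  <- modularity(cluster_fast_greedy(g))
qI <- modularity(cluster_fast_greedy(gIso))
all.equal(q, qI)  # TRUE: the isolated vertex leaves modularity unchanged
```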
First, I run the model 100 times and compute the network measures for each of the model runs.
Now, I can make some plots to detect general trends in what happens to the network after removal/rewiring.
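The data frames plotted below could be assembled like this (a sketch: three stand-in runs of random graphs replace the 100 real model runs, and `runMeasures` is a hypothetical intermediate; the column names match those used in the plots):

```r
library(igraph)

set.seed(1)
runMeasures <- do.call(rbind, lapply(1:3, function(run) {
  # Stand-in for one model run: a list of time-slice graphs
  graphs <- lapply(1:10, function(i) sample_gnp(50, 0.04))
  data.frame(run = run,
             slice = seq_along(graphs),
             density = sapply(graphs, edge_density),
             meanDistance = sapply(graphs, mean_distance))
}))

densDF <- runMeasures[, c("run", "slice", "density")]
mdDF   <- runMeasures[, c("run", "slice", "meanDistance")]
```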
densDF %>%
  ggplot(aes(x = slice, y = density)) +
  geom_line(aes(group = run), col = densColor, size = 0.2, alpha = 0.5) +
  theme_minimal() +
  geom_vline(aes(xintercept = burn.in + 1), size = 0.7, lty = 2) +
  scale_color_viridis_c() +
  ylab("Density") +
  xlab("Time slice")
mdDF %>%
  ggplot(aes(x = slice, y = meanDistance)) +
  geom_line(aes(group = run), col = mdColor, size = 0.2, alpha = 0.5) +
  theme_minimal() +
  geom_vline(aes(xintercept = burn.in + 1), size = 0.7, lty = 2) +
  scale_color_viridis_c() +
  ylab("Mean Distance") +
  xlab("Time slice")
## Warning: Removed 100 row(s) containing missing values (geom_path).
moduDF %>%
  ggplot(aes(x = slice, y = modularity)) +
  geom_line(aes(group = run), col = moduColor, size = 0.2, alpha = 0.5) +
  theme_minimal() +
  geom_vline(aes(xintercept = burn.in + 1), size = 0.7, lty = 2) +
  scale_color_viridis_c() +
  ylab("Modularity") +
  xlab("Time slice")
moduDF %>%
  filter(slice > 3) %>%
  ggplot(aes(x = slice, y = nClusters)) +
  geom_line(aes(group = run), col = moduColor, size = 0.2, alpha = 0.5) +
  theme_minimal() +
  geom_vline(aes(xintercept = burn.in + 1), size = 0.7, lty = 2) +
  scale_color_viridis_c() +
  ylab("Number of clusters") +
  xlab("Time slice")
First, let’s check whether degree and betweenness are indeed highly correlated. We suspect that they are.
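A check along these lines (a sketch on a single stand-in graph; in the real analysis one would pool the per-slice node measures):

```r
library(igraph)

set.seed(1)
g <- sample_gnp(200, 0.04)

# Correlation between the two node-level centrality measures
corDB <- cor(degree(g), betweenness(g))
corDB
```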
Indeed, these are highly correlated, so we can expect the two measures to show similar effects.
What about the ratio between the first and second changes? That is: what percentage of the loss/gain is recovered by the rewiring?
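For any one of the curves above, that percentage could be computed like this (hypothetical, made-up numbers; the real values would come from the slices just before removal, immediately after it, and after the rewiring phase):

```r
# Made-up illustrative values for one network measure
before  <- 0.040  # just before removal
after   <- 0.030  # immediately after removal
rewired <- 0.037  # after the rewiring phase

# Fraction of the removal-induced loss that rewiring recovered
recovered <- (rewired - after) / (before - after)
recovered  # ~0.7
```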